Analytical Methods for Causality Evaluation of Photonic Materials

We comprehensively review several general methods and analytical tools used for causality evaluation of photonic materials. Our objective is to call to mind and then formulate, on a mathematically rigorous basis, a set of theorems which can answer the question whether a considered material model is causal or not. For this purpose, a set of various distributional theorems presented in literature is collected as the distributional version of the Titchmarsh theorem, allowing for evaluation of causality in complicated electromagnetic systems. Furthermore, we correct the existing material models with the use of distribution theory in order to obtain their causal formulations. In addition to the well-known Kramers–Krönig (K–K) relations, we overview four further methods which can be used to assess causality of given dispersion relations, when calculations of integrals involved in the K–K relations are challenging or even impossible. Depending on the given problem, optimal approaches allowing us to prove either the causality or lack thereof are pointed out. These methodologies should be useful for scientists and engineers analyzing causality problems in electrodynamics and optics, particularly with regard to photonic materials, when the involved mathematical distributions have to be invoked.


Introduction
The time-domain response of electromagnetic system has to be causal [1][2][3][4], i.e., a system cannot respond before the excitation starts. Hence, causality restricts the characteristics of any system, not only in the time domain but also in the frequency domain. Therefore, it is often stated in literature that real and imaginary parts of various complex parameters and characteristics (e.g., susceptibilities and frequency responses) are related by the Kramers-Krönig (K-K) integral relations [4]. The K-K relations are also valid between the modulus logarithm and the argument of the frequency-domain response (e.g., between the phase delay and the attenuation of a solution to a wave-propagation problem). However, it occurs that integrations in the K-K relations are not straightforward, and sometimes even impossible to perform. It stems from singular, highly oscillatory, or even diverging integrands, and infinite integration limits in the Hilbert transformation [5]. Therefore, the integrals cannot be evaluated in many important practical cases because they are divergent or undefined [6]. The problems concerning causality and the K-K relations are studied not only in electrodynamics [7][8][9][10][11] but also in acoustics [12][13][14][15][16], solid mechanics [17,18] and the control theory [19].
Although the subject of causality and dispersion of electromagnetic waves in photonic materials (e.g., dielectrics) was initiated in 1927 by the seminal papers of Kramers and Krönig [20,21], it has remained active in literature up to now [22]. Furthermore, classical books, devoted solely to this subject, are published [4]. As one can notice, although the book [4] was published in 1972, it has remained the initial point for many investigations referring to the subject, and constitutes the basic reference in this research area. On the other hand, a discussion related to origins of the Titchmarsh theorem recently appeared in literature [23]. This theorem establishes the equivalence between causality and the K-K relations. It occurs that it is a compilation of two important theorems, i.e., the Paley-Wiener theorem and the Marcel Riesz theorem. Therefore, the Authors of [23] have perceived the need for studying the subject with the use of rigorous mathematical tools. This approach, although often longer than the one presented in physics literature, has the advantage of being more precise.
Let us present some recent results related to causality in electrodynamics and optics. In [17], the K-K relations, which are valid for a general class of linear homogeneous or inhomogeneous media, are derived. That is, the proof of the K-K relations proceeds without a priori knowledge of wave velocity in the medium supporting the wave, when the frequency goes towards infinity. In [10], the K-K relations are derived for the effective index of modes propagating in optical waveguides. When material dispersion and absorption can be neglected within the frequency range of interest, the evanescent modes introduce an effective loss term in the K-K relations, meaning that these relations are valid even if material absorption is negligible within the frequency range of interest. In [11], a novel approach to the standard one is proposed for the derivation of the K-K relations for linear optical properties. That is, this approach is not based on contour integration and the Cauchy integral formula. Although this derivation still employs analytic behavior of the property under consideration, it employs only elementary properties of the Hilbert transformation to obtain the second formula of the K-K pair, from the Herglotz representation of the optical property as a Herglotz function. In [24], linear-response laws and causality within the time and frequency domains are analyzed in electrodynamics. It is demonstrated that one can violate causality in the frequency domain by making a vanishing-absorption approximation. Our contribution to this subject relies on generalization of the K-K relations for evaluation of causality in power-law media. In general, for square-integrable functions of the frequency, the validity of classical K-K relations is equivalent to causality in the time domain [1,4,17]. Things get complicated when the K-K relations are verified between the modulus logarithm and the argument. Then, the considered function does not belong to the class of square-integrable functions, and one can employ classical K-K relations with subtractions, but their satisfaction does not mean that the originally considered function is causal [13,17]. That is, the dispersion relation can be formulated for the considered system. It is based on the assumption that the subtracted logarithm of the response is causal, but the considered response function does not have to be causal. Therefore, we address this issue in [25], where the K-K relations are generalized towards non square-integrable functions. That is, the K-K relations with one and two subtractions are formulated. Their validity and satisfaction of additional assumptions imply causality of the considered function. Then, the formulated theory is applied to the analysis of electromagnetic media characterized by power-law frequency dispersion [25] and fractional-order (FO) models [26]. However, in our opinion, it might still be unclear for community members how to apply the developed mathematical tools to causality evaluation. Therefore, we have decided to set in order the theory of causality and prepare the review of various analytical techniques allowing for evaluating causality of photonic materials. It includes the classical methods and models known from electrodynamics handbooks [7], as well as recently published results on causality evaluation [25,26]. Furthermore, we also consider various causality tests, which allow for establishing minimal limits for losses, for metamaterials [27,28]. Therefore, one can easily find their own approach to causality evaluation when a new model is formulated in classical electrodynamics. How-ever, it is worth noticing that the procedure of canonical quantization of macroscopic electromagnetism [29] requires satisfaction of the K-K relations by dielectric functions of a linear, inhomogeneous, magnetodielectric medium. Hence, our review should hopefully also be useful in quantum-electrodynamics research. The paper begins with a short introduction of the notation, definitions, and basic mathematical theorems used throughout the paper. In Section 3, the problem of causality is presented, with various mathematical models applicable in electrodynamics and optics. However, we do not focus on physical motivations for these models, but we rather formulate mathematical approaches allowing for causality evaluation. Analytical methods for causality evaluation are presented in Section 4, whereas a set of examples is presented in Section 5. In this paper, the most important models of photonic materials are analyzed in terms of causality with the use of precise mathematical tools. Furthermore, the set of various distributional theorems presented in literature is collected as the distributional version of the Titchmarsh theorem, allowing us to evaluate causality of complicated electromagnetic systems on a mathematically rigorous basis. Hopefully, researchers interested in evaluating causality of novel models can use our paper as a template for their own mathematical derivations and proofs.

Basic Notations
In this paper, a standard engineering notation is employed, which denotes an imaginary unit as j = √ −1. For the complex number s = u + jv, we denote its real part as s = u, and its imaginary part as s = v. The (open) right half-plane is denoted as C + = {s ∈ C : s > 0}. We denote the space of all functions which are holomorphic in the right half-plane and bounded by a polynomial as H + . To be more precise: G ∈ H + means that the function G is holomorphic in the right half-plane, and there exist l ∈ N and a constant A(σ) such that for all σ > 0 for all s ∈ C + satisfying s ≥ σ > 0. Then, we define the Fourier transformation for the absolutely integrable function f (t), using the formulation applied in electrical sciences (2) whereas the inverse Fourier transformation is given by In order to change the settings of the Fourier transformation to the one widely applied in mathematics and physics, one should replace the imaginary unit j with −i in (2) and (3) (where i = √ −1) [8]. Various literature sources which we refer to sometimes apply the mathematical i-convention; hence we convert those results to the engineering j-convention.
Among various elementary functions, we often refer to the Heaviside step function u(t), defined as u(t) = 1 for t ≥ 0 and u(t) = 0 for t < 0. Then we refer to the signum function sgn(t) defined as sgn(t) = t/|t| for t = 0 and sgn(t) = 0 for t = 0. One should be aware that these functions appear as regular tempered distributions and, as such, should be identified up to the Lebesque measure zero sets. Hence, in both cases, it actually does not matter how the functions are defined for t = 0.
In this paper, we often refer to the concept of Hölder continuous functions. We call the function f : R → C a Hölder continuous one (with an exponent α ∈ (0, 1]), if there exists such a constant L > 0 that for all t 1 , t 2 ∈ R. If the condition (4) is satisfied on every interval [−M, M] ⊂ R (with a constant L = L(M) possibly depending on M), we can say that the function is locally Hölder continuous. We should mention here that the case α = 1 corresponds to Lipschitz continuous functions. One can also notice that the class of locally Hölder continuous functions covers all the functions with a continuous derivative, but it is essentially larger (with the function f (t) = |t| as an example of Hölder continuous function with the exponent α = 1/2). In this paper, the Hilbert transformation [5] of the function f : R → R is applied. It is defined as The classical domain of this definition is the space L q (R) for any q ∈ (1, +∞). The definition (5) can be formulated for a wide class of distributions, which is discussed below.

Fractional Calculus
In modeling of dielectrics, one can find FO derivatives, hence the notation is fixed below. Various approaches to fractional calculus are considered in literature, i.e., Riemann-Liouville, Caputo, Grünwald-Letnikov and Marchaud, to mention just a few. For appropriate definitions, refer to classical monographs [30][31][32]. In this paper, when the fractional derivative of order α > 0 is applied, its Marchaud definition is used (refer to [31] Sections 5.4 and 5.5) for α = n + {α} (n ∈ N ∪ {0} and {α} ∈ (0, 1)). In (6), the function f is assumed to be sufficiently smooth, e.g., f ∈ C n+1 (R) with | f (n) | bounded by a function which is not growing too quickly in ±∞.
The main reasons behind the usage of the Marchaud derivative in our considerations are as follows [33,34]: • The Marchaud derivative of the order α ∈ (0, 1) for the function f exists, if f : R → R is bounded and locally Hölder with an appropriate exponent. • For the Marchaud derivative of the order α of the exponential function e jωt (where ω ∈ R is fixed), one obtains D α e jωt = (jω) α e jωt .
• The Marchaud and Grünwald-Letnikov derivatives coincide for a very broad class of functions. • The Marchaud derivative satisfies the semigroup property for all the f functions for which this definition coincides with the Grünwald-Letnikov definition, i.e., It is demonstrated in [33,34] that, in order to obtain the equivalence between the results in the time and frequency domains, the FO derivative modeling electromagnetic systems should be representable in the phasor domain (i.e., satisfy (7)) and satisfy the semigroup property. From this point of view, we considered in [33,34] the following definitions of FO derivatives applied for the electromagnetic modeling: Riemann-Liouville, Caputo, Liouville-Caputo, Liouville, Marchaud, Grünwald-Letnikov, Caputo-Fabrizio, Atangana-Baleanu, Atangana-Koca-Caputo and the conformable derivative. Out of these most popular approaches, only the Grünwald-Letnikov and Marchaud definitions (which are actually equivalent for a wide class of functions) satisfy the semigroup property and are naturally representable in the phasor domain. Therefore, we employ the Marchaud derivative in our research focusing on causality evaluation of photonic materials.

Distribution Theory
The distribution theory is applied in our investigations; therefore the notation is fixed below. It is a very formal mathematical theory, which is widely applied in many branches of science and engineering. However, the very formal formulation of this theory, e.g., proposed in purely mathematical books [35,36], is not straightforwardly applicable in applied physics and engineering. Even though alternative attitudes allow for different views on distributions, they are based on the same foundations. Therefore, we refer a reader to literature sources [37][38][39], which provide different perspectives on the distribution theory.
From now on, D denotes the space of test functions of the class C ∞ (R) with compact support, endowed with appropriate topology (i.e., with the formal definition of all convergent sequences within the space of test functions). The topology is given by an appropriate family of seminorms, as described in Section 1.2.6 of [38]. The space dual to D, i.e., the space of distributions, is denoted as D . The linear continuous functional f on D is denoted using the dual-pair notation f , ϕ , where f ∈ D and ϕ ∈ D. The space of Schwartz functions, i.e., rapidly decreasing functions, is denoted as S. Its dual space, i.e., the space of tempered distributions, is denoted as S . The Fourier transformation is defined for all tempered distributions by the formula [38] for all ϕ ∈ S.
Let E denote the space of all C ∞ (R) functions. Then its dual E is the space of distributions with compact support.
We should also refer to some spaces of test functions (and related spaces of distributions), which are useful when discussing the Hilbert transformation. Let us take q ∈ (1, +∞) and q such that 1/q + 1/q = 1. Then, the space of test functions D L q ⊂ C ∞ (R) consists of all ϕ ∈ C ∞ (R) such that all the derivatives ϕ (k) ∈ L q (R) for all k = 0, 1, 2... with the topology defined by the family of seminorms inherited from Sobolev spaces (for details, one is referred to (1.57) in [35]). Let D L q denote the space dual to D L q . The special case is q = 2 with D L 2 being dual to D L 2 . The derivative of the order k of the function (or distribution) f is denoted as D k f . For the space D L q , one can formulate the following theorem: One can find a detailed discussion of properties of the spaces D L q in Section 10.2 in [5]. The relation between different subspaces of the space of distribution D can be summarized as (see [5] Section 10.2) E ⊂ D L q ⊂ D L r ⊂ S ⊂ D (11) where q ≤ r. Let us define the distribution p. v.(1/x), which is needed in the sequel, as follows: As shown in [5] Section 10.7, the distribution p. v.(1/x) belongs to D L q for all q ∈ (1, +∞).
In [35] Lemma 1.8, the space S 0 is defined as the set of all tempered distributions such that F (S 0 ) = D L 2 . Following this definition, S q is defined as the space of such tempered distributions that F (S q ) = D L q , and the Fourier transformation is a one-to-one mapping between S q and D L q .
If one considers two distributions f , g ∈ D , then its convolution is not always welldefined (refer to the discussion in Section 10.6 of King's book [5]). However, when it is possible to define the convolution of distributions f and g, it is defined as the distribution h such that for the test function ϕ ∈ D. There are some cases, when the convolution of distributions is well-defined (refer to the end of Sections 10.6 and 10.7 in [5]), that is: if f ∈ D L p , g ∈ D L q and 1/p + 1/q ≥ 1 then f * g exists, and f * g ∈ D L r , where The Hilbert transformation of the distribution f ∈ D L q is defined as the distribution H( f ) (see [5] Equation (10.83)) satisfying the formula where ϕ ∈ D L q . For the function f ∈ L q (R), the Hilbert transformation can be written as the convolution which is defined in the distributional sense (see [5] Equation (10.102) and the following discussion ending at Equation (10.121)). We should bear in mind some important properties of the distributional Hilbert transformation. That is, the following properties are valid for q ∈ (1, +∞): • For any distribution f ∈ D L q , one obtains (cf. Section 10.9 in [5]) • Let f ∈ L q (R) be a function. Then one obtains (see (4.30) in [5]) • For any f ∈ D L q , the Hilbert transformation commutes with the distributional derivative (see (10.173) and the entire Section 10.10 in [5]) • The Hilbert transformation of the Dirac delta is given by (see (10.85) in [5]) • The distribution p. v.(1/x) belongs to D L q , and one obtains (see (10.87) in [5]) • A similar result to the one given above can be stated for the translated distribution Although the above property is quite obvious, we were not able to find it given directly in literature. Therefore, a short proof of this property is given. Let us take the test function ϕ ∈ D L q and check (we use the definition (14), the variable change property (17) for the transformation of L q (R) function, and additionally the change of variables x − a → x) which completes the proof of (21).
We also need to remember topology in the space S of tempered distributions: One says that the sequence of tempered distributions for any Schwartz function ϕ ∈ S [37].

Causality in Electrodynamics and Optics
Let us consider Maxwell's equations in the time domain where E and H denote, respectively, the electric-and magnetic-field intensities, D and B denote, respectively, the electric-and magnetic-flux densities, J and ρ denote, respectively, the current and charge densities. In this paper, we focus on dielectric properties of isotropic electromagnetic media, i.e., photonic materials, whose properties are described in the frequency domain. It stems from the fact that the number of models of dielectric properties of media is much larger than for magnetic properties. For such mathematical models, it is crucial to approximate physical characteristics by using causal formulas. However, the presented methods and tools can be extended towards magnetic characteristics. Furthermore, the obtained solutions of Maxwell's equations should also be causal. Finally, having the frequency response of media in the wave-propagation (or wave-guiding) problem, one can evaluate causality using the theorems presented below, supported by methods of their usage.

Basic Definitions
The function f : R → R (generally f : R → C) or the distribution f ∈ D is called causal if its support supp( f ) ⊂ [0, +∞). The Fourier transform F = F ( f ) is called a causal transform if supp(F −1 (F)) ⊂ [0, +∞). In other words, one can assume that f (t) is causal if it can be represented as f (t) = u(t)g(t), where u(t) is the Heaviside step function and g(t) = f (t) for t > 0 [40]. In practical terms, one should also assume that g(t) is the function whose Laplace transform has a non-degenerate region of convergence.
In terms of physics, causality means that the effect does not precede the cause. Hence, if one considers the electromagnetic system whose time-domain function f (t) describes the system response to the Dirac delta excitation, then causality means that f (t) = 0 for t < 0. It also means that response of the system depends only on excitation values from the past. To sum up, the mathematical definitions of causality formulated above closely follow physical understanding of this term.

Dielectric Models
Transformation of (23)- (26) into the frequency domain gives ∇ ·D =ρ (27) ∇ ×Ẽ = −jωB (28) ∇ ·B = 0 (29) ∇ ×H = jωD +J (30) where tilde denotes phasor representation of the physical quantity, i.e., a(t) = ãe jωt , and ω denotes the angular frequency. To solve this set of equations, one additionally needs constitutive relations betweenD andẼ as well as betweenB andH. Hence, one can writẽ Then one can also write that whereP = 0 χ e (ω)Ẽ andM = µ 0 χ m (ω)H are, respectively, the electric and magnetic polarizations, 0 and µ 0 are, respectively, the vacuum permittivity and permeability, χ e (ω) and χ m (ω) are, respectively, the electric and magnetic susceptibilities of the medium. As it has already been mentioned, our considerations are focused on isotropic dielectric media. Therefore, below we consider general χ(ω) functions, which are mainly related to dielectrics, but can also represent, in an obvious way, magnetic models of media (i.e., magnetic susceptibility). In electrodynamics, it is required that χ(ω) is a causal transform [7], because all dielectrics are not able to polarize instantaneously in response to an applied field. Alternatively, one can require that the function where r is the constant relative permittivity. Then let us assume the ohmic conduction, i.e., the current flow whose density depends on the electric-field intensity as follows: In (36), σ ∈ R denotes the electrical conductivity. Therefore, assuming that ω = 0, (30) can be written as where (ω) = 0 r − j σ ω (38) includes ohmic losses within the complex permittivity (ω). This result can be generalized towards where (ω), (ω) ∈ R are functions of frequency and (ω) describes all losses in the electric field. If the losses in a dielectric material stem from ohmic conduction, then (ω) is proportional to 1/ω. Otherwise, if losses stem from the bound charge and dipole relaxation phenomena, then (ω) is not proportional to 1/ω. Analogously, one can write for the permeability where µ (ω), µ (ω) ∈ R are functions of frequency. Let us formulate a few popular models describing the response of dielectrics with the use of permittivity: • Djordjevic-Sarkar relationship for lossy dielectrics [6,41] (ω) = 0 where ω = 0. The first term in (41) is the relative permittivity at very high frequencies, the second term is the broadband logarithmic term, and the third term comes from conductivity. In (41), ∆ r , ω 1 , ω 2 are model parameters. This formula gives a simple closed-form expression, which approximates the measured permittivity of the popular FR-4 substrate used for manufacturing printed circuit boards. • Westerlund relationship for FO capacitors [42,43] (ω) = where ω = 0. It allows for formulating the constitutive relation (31) in the time domain with the use of FO derivative as Although this model does not explain the nature of internal processes in dielectrics, it reproduces and predicts their behavior much better than any other theory (according to the Authors of [42]). Therefore, it is referred to as an 'engineering' model of dielectrics. Furthermore, this model allows for obtaining the electrical characteristics of FO capacitors, (refer to [43,44]). • Power-law relationship for porous media [45] ( where ω = 0, (ω) ∈ R, A is a constant, and α is close to 1.0 in a low frequency region, and is within the range of 0-0.5 in a high frequency region. The model (44) describes the permittivity of porous media such as wet soils and sedimentary rocks, which has been observed to be considerably different than in the case of water and parent minerals.

Dielectric Relaxation
Let us formulate several popular dielectric models based on electric susceptibility χ(ω) in the complex domain (τ > 0 is the relaxation time, and ∆ r = s − 0 0 where s = (ω → 0 + )): • Debye [46,47] This model is frequently used to describe simple dielectric characteristics of electromagnetic media arising from bipolar relaxation. The Formula (45) is characterized by a single relaxation time, which is capable of handling materials with high-water content. However, experimental studies show that the relaxation behavior of a wide range of dielectrics strongly differs from the Debye relaxation formula. Therefore, a number of phenomena such as broadness, asymmetry and excess in dielectric dispersion has motivated the development of empirical response functions described below, such as Cole-Cole, Cole-Davidson, Havriliak-Negami, and Raicu [46]. • Lorentz [7,47] where ω 0 is the frequency of a pole pair (the undamped resonant frequency of the medium), and γ is the damping coefficient. The model is based on the classical theory of light-matter interaction, and describes the frequency-dependant polarization due to bound charges. That is, bindings between electrons and nucleus in atom are treated similarly to those of the mass-spring harmonic-oscillator system. It is worth mentioning that any function obeying the K-K relations can be approximated as a superposition of Lorentzian functions, to any precision [48]. Therefore, the Lorentzian function (46) can be considered as a general building block for implementing causal susceptibilities of various materials, e.g., metamaterials. • Lorentz in high-frequency limit [7] where ω = 0 and ω p is the plasma frequency of medium. This model results from (46), assuming that the frequency is far above the highest resonant frequency ω 0 in the medium. • Lorentz in high-frequency limit with static magnetic induction [7] where ω = 0 and ω = ∓ω B . In (48), ω B is the frequency of precession of a charged particle in magnetic field. This model is the extension of (47), which involves an interaction between static magnetic field B 0 and tenuous electronic plasma of uniform density, when transverse waves propagate parallel to the direction of B 0 . • Lorentz in FO generalization [49] where ω p is the termed bulk plasma frequency associated with electrons, and γ is the damping coefficient. This model extends the classical Lorentz model (46) for a dielectric material with the use of FO derivatives, but it is formulated in the frequency domain. • Drude [47] where ω = 0 and ω = jγ. In (50), ω 0 is the Drude pole frequency and γ is the inverse of the pole relaxation time. This model results from the application of kinetic theory to electrons in solids for optical frequencies. It can be obtained from the aforementioned Lorenz model (i.e., harmonic-oscillator model) when the restoration force is removed (i.e., free electrons are assumed which are not bound to a particular nucleus). • Cole-Cole [46,[50][51][52] This model has been developed as an empirical extension of the Debye model (45), which can be obtained for α = 1. • Cole-Davidson [46,53] This model has been developed as an empirical extension of the Debye model (45), which can be obtained for β = 1.

•
Havriliak-Negami [46,54,55] This model extends Cole-Cole (51) and Cole-Davidson (52) models. • Raicu [46,56] where ∆ is the relative dielectric increment in the Raicu model. This model extends (53) by including the additional parameter γ. • Universal dielectric response [57][58][59] where ω = 0 and χ α is a positive constant. In general, this model is valid for ω >> ω p , where ω p is the loss-peak frequency. It describes the observed behavior of dielectric properties demonstrated by solid-state systems. That is, it involves power-law scaling of dielectric properties with frequency, which is widely observed in nature.
A particular dielectric model can be a weighted sum of the several susceptibility characteristics χ(ω) presented above. For instance, the characteristic can be a weighted sum of several Lorentz functions (46), defined for different pole pairs and damping coefficients [7]. In such a case, if each of the components is causal, then the considered model is causal as well.

Frequency Response
If one obtains a causal solution to Maxwell's Equations (23)- (26), then each of the components of the vector field E, D, H, B can depend only on the previous excitation values ρ, J. For instance, the relation between the excitation J and the response of electromagnetic system (E, H) in the frequency domain can be written as follows [8,60]: In (56) and (57), G ee (r, r ), G me (r, r ) denote dyadic Green's functions of electricelectric and magnetic-electric type, respectively. Because the considered electromagnetic-field systems are linear and time-invariant, it is possible to write the relation between a single excitation component (X(ω)) and a single output component (Y(ω)) in the frequency domain. Then, one obtains where G(ω) denotes frequency response of the system. For instance, let us consider one-dimensional (1-D) propagation of a monochromatic plane wave along the z direction in a medium described by material parameters (ω) and µ(ω). Then one can write the following Helmholtz equation forẼ =Ẽ(z)i x and H =H(z)i y : In (59), is the square of complex-valued wavenumber. In optics, the refractive index n(ω) is mainly used to describe an electromagnetic medium, which is a dimensionless number describing how fast the light travels through the medium. That is, is the velocity of light in the medium and c is the velocity of light in the vacuum. Hence, one can also write that k 2 = n 2 (ω)(ω/c) 2 . If the refractive index is a complex number for a given angular frequency, its real part indicates the phase velocity, whereas its imaginary part describes the attenuation of electromagnetic waves in the medium. Let us consider the signalling problem [61][62][63], where the electric field in the homogeneous Equation (59) is excited by a source at the spatial-domain boundary. Hence one obtains whose general solution is given bỹ in (61),Ẽ + andẼ − denote, respectively, complex amplitudes of waves propagating in the +z and −z directions, and +k and −k are complex roots of k 2 . Considering wave propagation (or guiding) in the +z direction only, the propagation constant jk depends on the choice of (ω) and µ(ω) functions, and is selected as the one with a positive real part. Hence one obtainsẼ (z) =Ẽ + e −jkz (62) whereẼ + is equal toẼ(z = 0). Such a solution is physically equivalent to impinging of the plane wave on the half-space constituting a medium described by (ω) and µ(ω). Then the wave is transferred into the medium and its time-domain waveform can be obtained as described in [25]. Alternatively, one can consider (62) as a general solution of the wave-guiding problem for, e.g., optical waveguide. Assuming the fixed length of the wave-propagation distance z = L, and taking X(ω) =Ẽ(z = 0) and Y(ω) =Ẽ(z = L), the Formula (62) can be considered as the relation (58) where Such a function is usually required to be a causal transform in electrodynamics. Furthermore, one can require that G(ω) is relativistically causal. That is, not only the inverse Fourier transform of G(ω) is equal to zero for t < 0, but the inverse Fourier transform of G(ω)e jωL/c is also equal to zero for t < 0.
Let us consider FO models in electrodynamics, which start with constitutive relations as follows [43,44,63]: Equation (43) is repeated here as (65) for the sake of completeness. When these relations are applied to Maxwell's Equations (23)- (26), then one obtains the FO Maxwell's equations in the following form: In order to analyze the 1-D propagation of a monochromatic plane wave along the z-direction, one can apply the phasor-domain representation and arrive at the special version of the Helmholtz Equation (59) with With the additional assumption of the lack of current and charge sources within the considered space, and assuming that there is no power dissipation due to Joule's heating, i.e., the current density is related to the electric-field intensity by the classical Ohm law (α = 1) with the conductivity σ 1 = 0, the frequency response (i.e., the transfer function in the frequency domain) is given by (cf. [63]) where ν = β+γ 2 . For α = 1, σ 1 = 0 and β = γ, one can write (67)-(70) in a compact form is the Riemann-Silberstein (RS) vector in the time-fractional electrodynamics [64] and Assuming the plane-wave, spherical and cylindrical symmetries of solutions to (77), one obtains, respectively, the following transfer functions describing the 1-D wave propagation [64]: In (80), the function J 0 is the Bessel function of the first kind of zero order, i.e.,

Methods and Analytical Tools for Causality Evaluation
Let us consider a complex-valued transfer function G : R → C or distribution, which is the Fourier transform of a certain time-domain function g : R → C or distribution. It is worth noticing that we do not assume a priori that g(t) is real-valued.
Approaches to causality evaluation are presented in Figure 1. Having G(ω), one can evaluate causality by way of applying the Paley-Wiener theorem, calculating the inverse Fourier transformation, finding a holomorphic extension to the right half-plane, or checking various forms of the K-K relations. One should notice that, for the considered function G(ω), not every approach can easily be applied to prove either causality or lack thereof.

Paley-Wiener Theorem
Let us start our considerations from the Paley-Wiener theorem, which allows for characterisation of the modulus of the complex-valued L 2 function in terms of causality.
Theorem 2 (Paley-Wiener, [65] Theorem XII). Let φ(ω) be a real nonnegative function, not equivalent to 0 and belonging to L 2 (R). A necessary and sufficient condition that there should exist a real-or complex-valued function g(t), vanishing for t ≤ t 0 , for some number t 0 , and such that the One should notice that the Paley-Wiener theorem does not state that the complexvalued function G(ω) is a causal transform. It states that, for the modulus φ(ω) satisfying (82), the causal transform G(ω) exists with the same modulus. It also states that if φ(ω) = |G(ω)| does not satisfy (82), then G(ω) is surely not a causal transform. This theorem is a valuable tool, which can be used to prove that the transfer function G(ω) is not a causal transform.

Calculation of Inverse Fourier Transformation
The simplest approach to causality evaluation relies on calculating the inverse Fourier transformation of the function G(ω). Then g(t) = F −1 (G(ω))(t) is not causal if its support is not contained in [0, +∞) (if g(t) is a continuous function, then it is enough to show that it is not equal to zero for a certain t ≤ 0). Alternatively, g(t) is causal. The method seems to be very simple, but it may be really difficult to calculate the inverse Fourier transformation. In some cases, one knows the exact formula for the (inverse) Fourier transform for a given function or distribution. On the other hand, in numerous cases the exact formula is unknown.
In some cases, one can easily prove lack of causality by referring to the properties of the inverse Fourier transform. For instance, if the function G(ω) is an L 1 (R) function, then the inverse Fourier transform is a continuous function. Hence, it is enough to find a single point t 0 ∈ (−∞, 0] such that F −1 (G(ω))(t 0 ) = 0, e.g., to show that F −1 (G(ω))(0) = 0. This idea may not be applied directly when G(ω) is an L 2 (R) function or a distribution which is not represented by an L 1 (R) function. With the definition of the Fourier transformation extended to the above-mentioned domains, one identifies the result up the sets of measure zero. In this case, without continuity of the result, showing that the inverse Fourier transform is non-zero at a single point does not prove lack of causality.

Holomorphic Extensions and K-K Relations
The classical perspective on causality is provided by the Titchmarsh theorem, which works for functions g ∈ L 2 (R) (see Theorem 1.6.1 in [4]): Theorem 3. If a square-integrable function G(ω) fulfils one of the four conditions below, then it fulfils all four of them: (i) The inverse Fourier transform g(t) of G(ω) vanishes for t < 0: is, for almost all v, the limit as u → 0 + of an analytic functionG(u + jv), which is holomorphic in the right half-plane and square integrable over any line parallel to the imaginary axis:ˆ∞ (iii) G and G verify the first Plemelj formula: (iv) G and G verify the second Plemelj formula: One should notice that the relations (83) and (84) hold in the sense of elements of the L 2 (R) space, i.e., the equalities hold for almost all ω ∈ R.
The relations (83) and (84) are also referred to as the K-K relations or the dispersion relations. This theorem delivers two aforementioned approaches to prove causality, i.e, searching for an appropriate holomorphic extension of G(ω) to the right-half plane, and proving the validity of the K-K relations (83) and (84).
If the function G(ω) is the Fourier transform of the real-valued function g(t), then it is hermitian, i.e., it has an even real part and an odd imaginary part. Therefore, the K-K relations (83) and (84) can be formulated for almost all ω ∈ R by the following integrals on (0, +∞): It should be mentioned that the growth assumption in the case (ii) of Theorem 3 (i.e., that the maps G σ (ω) =G(σ + jω) belong to L 2 (R) for all σ > 0, and that all the norms G σ L 2 (R) are uniformly bounded by the same constant C > 0) is vitally important. The sole existence of a holomorphic extension of the function G(ω) may not be sufficient for its causality. The case of G(ω) = e −ω 2 is a good example. This function naturally extends to the holomorphic function e s 2 , where s = σ + jω, while the inverse Fourier transform e −t 2 /4 is not a causal function. The uniform boundedness of the norm G σ L 2 (R) is the violated condition. It is because |G σ (ω)| = e σ 2 e −ω 2 , thus, for σ → +∞, the norm can be arbitrarily large.
The general distributional version of the K-K relations, given as Theorem 5 in [25] (following Theorem 3.10 in [35]), is presented below. This version is a generalization of the well-known K-K relations with subtractions (as described in Section 1.7 of [4]). The procedure of subtractions works for such a Fourier transform G(ω), which is not necessarily in L 2 (R). However, when divided by some polynomial (ω − p) k of degree k, it belongs to L 2 (R). The generalization towards distributions is sometimes required because the division itself can introduce a singularity, resulting in a function which is not locally square integrable (see the discussion in Section 4.2 in [25]). In the distributional version, it is not required that the division result, i.e., F/(jω) k , is in L 2 (R), but it can be a distribution belonging to a class D L q for some q ∈ (1, +∞), which can be broader than the functional space L q (R) (see Example 1 in [25]).

Theorem 4.
Let us assume that F = (jω) k G, where G ∈ D L q . A necessary and sufficient condition that F ∈ S should be the boundary value in the S topology of a functionF(s) ∈ H + , i.e., that F andF are, respectively, the Fourier and Laplace transforms of a distribution f = F −1 (F), with supp( f ) ⊂ [0, +∞), is that either One should notice that (88) can be written with reference to the distributional version of the Hilbert transformation as The other important detail hidden in this theorem is related to the growth condition on the holomorphic extension. One should remember that the conditionF(s) ∈ H + means thatF is holomorphic in C + , and that its growth is controlled by some polynomial as the condition (1) states.
A slightly less general version of Theorem 4, which can be used in the case of distribution F being represented as a locally integrable function, is formulated below.
Theorem 5 (Theorem 6, [25]). Let us assume that k ∈ N ∪ {0} and F : R → C is a function such that F(ω)/(jω) k is a locally integrable function of the growth at +∞ given by for |ω| ≥ M, for some M > 0, C > 0 and for ε > 0. Let us also assume that where P k−1 (ω) is a certain polynomial of the degree k − 1 at most. Then F is a causal transform.
In some cases, the K-K relations can be evaluated for the transfer function logarithm. This attitude can result in sufficient conditions for causality.
The 'holomorphic-extension' approach can be generalized towards functions with a polynomial growth. The first theorem can be found in [35] as Theorem 2.7.
The theorem formulated above gives rise to practical sufficient conditions for causality of the transform G(ω).
Then there exists g ∈ S + such that G = F (g) andG = L(g).
Proof. This is a direct consequence of Theorem 7. Let us take any ϕ ∈ S and notice that The functions G σ are definitely continuous (as sections of holomorphic function), and hence locally integrable. Moreover, each of the integrals´+ for any σ ∈ (0, r] and almost all ω ∈ R, as well as the function G σ (ω)ϕ(ω) is (pointwise, almost everywhere) convergent to the function G(ω)ϕ(ω), we can refer to the Dominated Convergence Theorem (see, e.g., (2.206) in [5]) and state that It means that the tempered distributions G σ converge to G in S topology. Thanks to Theorem 7, we know that G(ω) is a causal distribution.
The set of distributional theorems presented in this section can be collected as a distributional version of the Titchmarsh theorem. It provides the conditions for the tempered distribution f ∈ S to be supported in [0, +∞), due to the properties of its Fourier transform F = F ( f ). Let us mention that each of the tempered distributions F(ω) can be represented as F(ω) = (jω) k G(ω), where G ∈ D L q (see the discussion following Definition 3.2 in [35]).

Theorem 10.
Let us assume that the tempered distribution F ∈ S and F = (jω) k G, where G ∈ D L q . Then, if F fulfills one of the three conditions given below, it fulfills all of them: (ii) the distribution F is the boundary value in the S topology of a functionF(s) ∈ H + ; (iii) the following relation is satisfied for a certain polynomial P k−1 (ω) for the degree k − 1 at most.

Application of Paley-Wiener Theorem
Example 1. As it has been mentioned in Section 4.1 above, the Paley-Wiener theorem is a useful tool to detect the lack of causality. This is the case of frequency response of systems governed by the power law with an exponent ν > 1. Let us take G(ω) as given by (73). One can see that G(ω) ∈ L 2 (R) and |ln|G(ω)|| = z µ γ β cos πν 2 |ω| ν .

Calculations of Inverse Fourier Transformation
Example 2 (Dielectric model with constant permittivity (35)). This model implies that electric susceptibility χ(ω) = r − 1 is constant as well. This directly implies that is a causal distribution.
Example 3 (Dielectric model with ohmic losses (38)). This model is not valid for ω = 0; hence it is extended in the distributional sense as follows: This directly implies that is a causal distribution. (45)). For this relaxation model, one can calculate the inverse Fourier transformation as follows [47] Section 9.2.1:
Example 6 (Lorentz in high-frequency limit-model (47)). This model requires some explanation, which is important from the formal mathematical perspective. The function χ(ω), given by (47), is not locally integrable, so it may not be treated as a distribution. Hence we are not able to treat it as a Fourier transform of any function or distribution (in other way, it is not possible to find the inverse Fourier transformation thereof), which means that we are not able to formally ask whether it is a causal distribution or not. The natural way to treat it as a distribution is to use the principal part formalism (see, e.g., the discussion in Section 10.1 in [5]), which means that one can associate 1 ω 2 with the distribution p. f. 1 ω 2 given by p. f. 1 It might be shown that (in the distributional sense) Dp. v. 1 ω = −p. f. 1 ω 2 . With this identification, i.e., treating − ω 2 p ω 2 as −ω 2 p p. f. 1 ω 2 , one can discuss the causality of this model. That is, the inverse Fourier transform of is given by For t<0, F −1 (χ(ω))(t) = 0; hence (104) is not a causal transform. However, this observation requires some comment as well (refer to the discussion about the Formulas (19a), (19b) and (20) in [67]). Let us notice that, for one obtains Hence the model (47) can be causal if it is extended in the distributional sense as follows: One is also referred to Example 20 given below, where causality of this model is discussed from the perspective of the K-K relations.

Example 7 (Lorentz in high-frequency limit with static magnetic induction-model (48)).
One can notice that the Formula (48) is representable as where ω = 0 and ω = ∓ω B . Because (109) includes singularities, we write it as Then we use the following properties of the Fourier transformation: Hence one obtains Finally, the inverse Fourier transformation of (110) can easily be calculated as However, (110) is not a causal transform. Furthermore, it is not a purely real function. Therefore, in order to obtain a causal transform, we extend (48) with distributional terms which are appropriate multiplicities of δ(ω) and δ(ω ± ω B ). That is Then one obtains in the time domain Although (113) is a causal function, its values are complex in the time domain. Hence it is not a physical formula for any susceptibility.
One is also referred to Example 21 given below, where causality of this model is discussed from the perspective of the K-K relations.
Example 9 (Djordjevic-Sarkar relationship for lossy dielectrics (41)). This model is undefined for ω = 0; hence we extend it in the distributional sense as follows: (ω) = 0   1 + ∆ r log 10 Using the time-domain representation of the function (41) proposed in [6] (see Formula (6) therein with p = 0), one can calculate It is a causal distribution.
The last estimate shows that all the functions |X σ (ω)| are dominated by the function f (ω) = 1.

Hence, by Theorem 8, the function χ(ω) is a causal transform.
For α ∈ (1, 2), one cannot be sure of the sign of cos(αθ). However, the derivations can be continued with a slightly different estimate. Let us review the denominator of (123) For the fixed α ∈ (0, 2), one obtains |αθ| ≤ α π 2 < π. Hence the inequality is obtained for a certain positive constant c α . Two nonnegative terms are obtained in a denominator; hence one can write an estimate Let us notice that α/2 ∈ (0, 1), so the function f (ω) = 1 √ 2c α τ|ω| α/2 is locally integrable and is bounded for |ω| → +∞. Hence it can be considered as a tempered distribution. Moreover, the inequality can be formulated as follows: One can also notice that X(s) ∈ H + . It is definitely holomorphic in the right half-plane, so it is enough to show that it can be estimated by a polynomial in half planes s ≥ σ 0 > 0. Let s = σ + jω with σ ≥ σ 0 . Then one can estimate by a constant. It proves that X(s) ∈ H + ; hence all the assumptions of Theorem 8 are satisfied. Therefore the Cole-Cole model (51) is causal for α ∈ (0, 2). (52)). One can easily notice that, for β = 1, the transform
Example 12 ). This model can naturally be extended towards the function X(s) = ∆ r (1+(sτ) α ) β , holomorphic in the right half-plane C + . Let us denote θ = atan( ω σ ) ∈ (− π 2 , π 2 ), similarly as for the Cole-Cole model. Let us also denote s = σ + jω. It gives because cos(θα) ≥ 0. Hence one obtains the same estimate as in the previously considered models. Similarly as before, by Theorem 8, one can state that the transform is causal.

Example 14 (Lorentz model in FO generalisation
The open question is whether this extension is well-defined over C + . First, let us observe that the quadratic equation with an unknown z = s ω 0 α has the following solutions Hence, no matter if γ 2 − 1 is positive or not, the real part of the solution z is negative. It means that, for all z ∈ C + , the denominator of the function given by (137) is non-zero. If s ∈ C + , then also z = s ω 0 α ∈ C + . Hence the extension (137) is well-defined over the entire half-plane C + . Now we are going to check the assumptions of Theorem 8. First, since the modulus of the denominator is bounded from below by some constant C > 0 in the closed right half-plane C + , one can write the estimate for any σ ≥ 0 and ω ∈ R. Because the pointwise convergence X σ (ω) = χ(ω) as σ → 0 + is obvious, by Theorem 8, the function χ(ω) is a causal transform.
Now, let us take a look at the transfer functions which are derived for the formulation of time-fractional electrodynamics based on the RS vector.
Example 15 (Plane-wave and spherical symmetries of solutions to diffusion-wave equation formulated based on RS vector-models (78) and (79)). For the functions G z and G R , given by (78) and (79), respectively, the appropriate holomorphic extensions exist for β ∈ (0, 1), i.e., The assumptions of the Titchmarsh Theorem 3 are satisfied in the point (ii) for both functions; hence it directly proves their causality.
Example 16 (Cylindrical symmetry of solution to diffusion-wave equation formulated based on RS vector-model (80)). The function G r , given by (80), does not belong to L 2 (R). It stems from the asymptotics of the Bessel function J 0 given by valid for z ∈ C such that arg z ∈ (−π, π) as |z| → +∞ (see Formula 9.2.1 in [68]; more details are given in Chapter VII in [69]). A natural holomorphic extension to the right half-plane for G r (ω) is given by G r (σ + jω) = G r (s) = J 0 (js β r µ β β ).
The behavior in infinity described by (143) does not allow one to state that the function J 0 (js β r √ µ β β ) has a polynomial growth for s ≤ ρ, and a positive ρ > 0. That is why, when trying to prove the causality of the operator, one should refer to the stronger Theorem 9 and treat the function G r (ω) as a tempered distribution. The condition (i) of above-mentioned Theorem 9 is equivalent to the statement that, for any ε > 0, such C = C(ε, ρ 0 ) > 0 exists that ln |G(s)| ≤ ln C + ε|s| 2 . (145) As one can see, due to the estimate (143), the modulus |G r (s)| can be estimated by Ce |s| β as |s| → +∞; hence the estimate (145) is satisfied. Moreover, taking y ∈ R, one can write and because of the trivial estimate | ln(J 0 (jy β ))| ≤ ln(|J 0 (jy β )|) + π (147) one can focus on the estimate of |J 0 (jy β )|. Looking at the terms on the right-hand side of (143), one can see that the dominating term is of the order e A|s| β for some constant A > 0. Hence one obtains On the other hand, by the Hankel's Asymptotic Expansions (see [68] Equation (9.2.5) and [69] Section 7.3), the function J 0 (js β r √ µ β β ) is bounded as |s| → +∞. Hence, the function (144) satisfies all the assumptions of Theorem 9, so it is also a causal transform.

Applications of K-K Relations
Example 17 (Westerlund relationship for FO capacitors [42,43] and power-law relationship for porous media [45]-models (42) and (44)). Both Equations (42) and (44) are in the form for certain constants C > 0 and α ∈ (0, 1). We are going to show that the function is a causal transform (subtracting 1 does not influence causality, since it is a transform of the Dirac delta, i.e., a causal distribution). One can see that Let us refer to Theorem 5 for k = 0. As F(ω) is locally integrable and satisfies growth conditions in ±∞, it is sufficient to check the K-K relation (91), i.e., Let us calculate the Hilbert transforms Let us refer to a certain integral formula taken from (3.241.3) in [70] (valid for p < q) Taking q = 2 and p = −α + 1 < 2, as well as applying the substitution x = τ/ω, one obtains Similarly, taking p = −α + 2 < 2, one obtains Now, we take the Formula (152) and convert it to the integral over (0, +∞) It means that and the real parts on both sides of (151) coincide. Now, we take the Formula (153) and convert it to the integral over (0, +∞) It means that which proves that the imaginary parts of (151) coincide as well. It completes the proof that all the assumptions of Theorem 5 are satisfied; hence the transform (149) is causal. One should notice that the solutions obtainable for various power-law media and time-fractional electrodynamics may not be relativistically causal. That is, for the time-fractional diffusion-wave equation, the propagation velocity of a disturbance is infinite, but its fundamental solution possesses a maximum which disperses with a finite velocity [63].

Example 18.
Let G(ω) = 1 jω . This is the function which is not locally integrable, so not in L 2 (R), but can be identified with a distribution p. v. 1 jω , which is in D L 2 . Then the K-K relations (87) can be interpreted for k = 0 as Let us observe that, by (20), there is Hence the relation (161) is not satisfied. On the other hand, when the modified transform is taken one can see that (referring to (19)) However, it is well-known that F −1 (G(ω)) = 1 2 sgn(t), which is not a causal function, and F −1 (G c (ω)) = 1 2 sgn(t) + 1 2 = u(t), which is a causal function. This fact can seem to be slightly paradoxical. It is because the transform p. v. 1 jω is closely related to the operator assigning an integrable function its primitive. It is natural to consider this operator as a causal operator, as it converts causal integrable functions to causal locally integrable functions. The problem is that multiplication by the function 1 jω does not always correspond to integration in the time domain. Unfortunately, the time-domain integration operator, applied to functions from the space L 1 (R) or L 2 (R) or to functions from the Schwartz space S, does not always give a function from one of these spaces. The condition necessary to obtain the antiderivative operator´t −∞ f (s)ds acting to any of these spaces is that´+ ∞ −∞ f (s)ds = 0, and it is not always satisfied. Under this assumption, one can write where F(ω) = F ( f )(ω). This is the only case when one can associate the multiplication by the function 1 jω with an integration operator. In the more general case of f ∈ L 1 (R) and´+ ∞ −∞ f (s)ds = F(0) ∈ R (not necessarily equal to zero), one obtains However, this formula does not work for all the tempered distributions. For instance, let us take f (t) = u(t), i.e., the Heaviside step function. Then and F (g)(ω) = D(p. v. 1 jω ) + iπδ (ω), where δ denotes the distributional derivative of the Dirac delta, and D(·) denotes the distributional derivative. On the other hand, F(ω) = F ( f )(ω) = p. v. 1 jω + πδ(ω) and none of the Formulas (163), nor (164) can be correct (actually, these are not even well defined, due to the fact that the multiplication of distributions is not well-defined in general).

Example 19.
Let us fix a ∈ R, and check when G(ω) = e −jaω is a causal transform. The answer is obvious, because G(ω) = F (δ(t − a)), which is a causal distribution iff a ∈ [0, +∞). However, we are going to look at the Formula (87) for k = 1 (the value of k is selected as the minimal natural number, such that e −jaω ω k ∈ D L 2 ). The left-hand side of the above inequality is obviously DG(ω) = −jae −jaω . On the other hand, one can write the right-hand side as Let us now calculate the convolution for the test function ϕ ∈ D L 2 Hence, both sides of (166) coincide iff asgn(a) = a, i.e., for a ∈ [0, +∞).
Having reviewed two abstract cases, let us look at some of the models from the perspective of the K-K relations.
Example 20 (Lorentz in high-frequency limit-model (47)). Because we treat the function 1 ω 2 as a distribution p. f. 1 ω 2 (see the discussion in Example 6), one can write It means that χ(ω) ∈ D L q as a derivative of the distribution from D L q . Then it means that the K-K relations (87) should be taken for k = 0. Let us consider In these derivations, the Formulas (18) and (20) are used. Hence one can conclude that the transform χ(ω) is not causal. However, as previously, one can notice that, by adding a singular term to χ(ω), a causal transform is obtained. That is is a causal transform.
As before, one can see that the formula (48) is representable as Let us check the K-K relations for this function. Since χ(ω) ∈ D L 2 , one can refer to (87) for k = 0; hence one can directly refer to (161). By the Formulas (20) and (21), one can notice that It implies that χ(ω) is not a causal transform. However, it can be concluded that adding singular terms in order to obtain results in a causal transform.

How to Prove Lack of Causality?
Obviously, one can prove that an appropriate holomorphic extensionG(s) of the transform G(ω) does not exist, but it does not look like an easy task. One should return to the K-K relations instead, and prove that one of the equalities (83) or (84) (or in the case of hermitian transforms (85) or (86)) is not satisfied in L 2 (R). Because the equalities are given between the elements of L 2 (R) space, it means that violation of any of the conditions (83) and (84) for the transform G(ω) in a single point does not prove that the transform is not causal in general. Equations (83) and (84) are in L 2 sense; hence such equalities are valid almost everywhere. Fortunately, in certain cases, it occurs that the relations (83) and (84) are valid for all ω ∈ R.
We now provide the theorem stated by Wood in 1929 ([71] Theorem I, see also [5] Section 3.4.1): Theorem 11 (Wood,[71] Theorem I). Assuming that f , g : R → R are such functions that t dt exist for certain M > 0; (iii) f is locally Hölder continuous of the order α ∈ (0, 1).
Then the function g is also locally Hölder continuous with an exponent α and holds for all x ∈ R.
Hence, for the Hölder continuous functions G(ω) or G(ω), with a behavior in ±∞ as required by Theorem 11, violation of any of the relations (83) or (84) in a single point proves that they do not hold in L 2 (R). This attitude is used in the following example: Example 22. In [26], the transfer function induced by the two-sided fractional derivative introduced by Ortigueira and Machado (see [72,73]) is considered. It results in the following transfer function: . (175) Hence one obtains As it is shown in Lemma 2 in [26], for this transfer function, as well as for ν ∈ (0, 1) and 1 2 (Θ − 1)ν ∈ Z, the following two equalities hold true: Because both functions are locally Hölder continuous, the K-K relations hold true in L 2 (R) iff they hold true pointwise. Hence, the transfer function G Θ,ν (ω) within the considered range of parameters is not causal.
In the case of distributions, Theorem 11 can also be helpful. Let us first consider the following example: Example 23. Let us return to the case of ν = 3, 5, 7... mentioned in Example 1. We are going to show that the transform where the sign depends on the parity of l ∈ N for ν = 2l + 1, is not causal. As one can see, the distribution belongs to D L 2 . Hence one can check if the distributional relation (89) is satisfied. Let us denote In order to state that G ν (ω) is a causal transform, one needs to check that G ν (ω) = ω(H( f ) + jH(g)).
As one can see, the function f (ω) satisfies the assumptions of the Wood Theorem 11. Moreover, it is an L 2 (R) function, so the distributional Hilbert transform can be replaced by a standard L 2 transform. By the Wood Theorem 11, the real part of the right-hand side of (182), i.e., G ν (ω), is a locally Hölder function. It means that cos(z µ γ β ω ν ) = ±ω(H( f )(ω)) for all ω ∈ R. This is obviously violated for ω = 0, and it proves that (89) is not satisfied. Hence G ν (ω) is not causal.
This example can be a good starting point for some general observations in the context of Theorem 4, providing an easily verifiable necessary condition for causality. Theorem 12. Let us assume that F = (jω) k G, where G ∈ D L q , and that (89) holds, i.e., Then, if the real (respectively imaginary) part of G is a locally Hölder continuous L q function, then its imaginary (respectively real) part must also be a locally Hölder continuous L q function.

Causality Tests for Refractive Index
One can formulate the causality problem for media described by the refractive index n = n(ω) ∈ C. From the point of view of Maxwell's equations, the velocity of electromagnetic wave is described by (ω) and µ(ω). Hence one obtains that n 2 (ω) = cµ(ω) (ω), refer to (59). Let us assume that the dielectric-relaxation function χ e (ω) satisfies the assumptions of the Titchmarsh Theorem 3. Therefore, among others, it belongs to L 2 (R) and is causal. Then χ e (ω) has a holomorphic extension to the right half-plane, and the permittivity (ω) = 0 (1 + χ e (ω)) also has a holomorphic extension to the complex right half-plane (the same considerations are applicable to the permeability µ(ω) = µ 0 (1 + χ m (ω))). One should notice that this condition formulated for e −iωt settings implies that the considered function is holomorphic in the upper half-plane. We should mention that the existence of holomorphic extension is not a sufficient condition for causality, i.e., the behavior in ∞ is important as well. In general, assuming that χ e (ω) and χ m (ω) belong to L 2 (R) does not imply that the product χ e (ω)χ m (ω) ∈ L 2 (R). Hence, one has to be careful when deciding whether n 2 (ω) − 1 is an L 2 (R) function. Similarly, if the holomorphic extensionsχ e (s) and χ m (s) satisfy the assumptions (ii) of Theorem 3, it does not mean that these assumptions are satisfied by the productχ e (s)χ m (s). This means that we may not form conclusions about causality based only on the existence of holomorphic extension. Some assumptions concerning the behavior of the productχ e (σ + jω) ·χ m (σ + jω) for a fixed σ > 0 and |ω| → +∞ are needed as well. As proposed by Stockman [27], the K-K relations can be written for the complex refractive index as Although the assumption χ e ∈ L 2 (R) does not imply that lim |ω|→+∞ χ e (ω) exists, in practical terms it is natural to assume that lim |ω|→+∞ χ e (ω) = 0. Due to this assumption regarding the electric susceptibility χ e (ω), as well as the magnetic one, one obtains that n 2 (ω) → 1 when ω → +∞ [74]. The usual physical justification behind this assumption is that an incident, oscillating, electromagnetic field entering any medium stimulates the charges in that medium to oscillate (light-matter interaction). However, for very high frequencies of the incident field (ω → +∞) the charges of the medium cannot respond, because they have a finite mass, hence their inertia. As a result, for those very high frequencies, it is as if the field 'sees' a vacuum (n 2 (ω) → 1), because it effectively does not interact with the medium at all.
From this additional assumption (i.e., lim |ω|→+∞ χ e (ω) = 0, lim |ω|→+∞ χ m (ω) = 0), as well as the assumption that both functions χ e and χ m are bounded (it happens, e.g., when they are continuous), one can conclude that the product χ e · χ m belongs to L 2 (R). In this case, one knows that n 2 (ω) − 1 belongs to L 2 (R). It allows one to say that the relations (183) and (184) imply causality of the transform n 2 (ω) − 1 by the classical Titchmarsh Theorem 3. From the formal perspective, the assumptions related to limits of the functions χ e , χ m as |ω| → +∞ and their boundedness can be relaxed (e.g., by saying that one of these functions is any L 2 (R) function, and the other one is a bounded L 2 (R) function), but it is far less natural and does not allow to conclude that n 2 (ω) → 1 when ω → +∞.
A separate discussion is needed when the susceptibilities χ e (ω) and χ m (ω) are not L 2 (R) functions, e.g., when they are tempered distributions. In this case, one has to be careful when defining the product n 2 (ω) = c(1 + χ e (ω))(1 + χ m (ω)), as the product of two distributions is not necessarily well-defined. One can surely define n 2 (ω) as a tempered distribution when both χ e (ω) and χ m (ω) are represented by locally integrable functions, which are bounded as |ω| → +∞ by some polynomial. In this case, the K-K relations can be checked in the distributional sense with the use of Theorem 4, but it requires dividing the function n 2 (ω) or n 2 (ω) − 1 by an appropriate power of (jω). Let us also notice that the K-K relations for n 2 (ω) and n 2 (ω) − 1 can be different. In the case considered above, when n 2 (ω) − 1 belongs to L 2 (R), there is no need to divide n 2 (ω) − 1 by any positive power of (jω) (i.e., one may take k = 0 in the Formula (87) or (88)). On the other hand, n 2 (ω) is not the function in D L 2 , and in order to verify any of the Equations (87) or (88), one should take k = 1.
In literature one can also find the K-K relations formulated for n(ω) = n 2 (ω) However, as it can be noticed in [27], the function n(s) = n 2 (s) can be not holomorphic in the right half-plane s ∈ C, even if n 2 (s) is holomorphic. For instance, when n 2 (s) approaches zero, then the derivative of its square root does not exist. It clearly shows that causality of n(ω) is not necessarily equivalent to the case of n 2 (ω) causality. Apart from the problem with being holomorphic in C + , there is also the problem of its behavior as |ω| → +∞. If n(ω) − 1 ∈ L 2 (R), then it does not necessarily mean that n 2 (ω) − 1 ∈ L 2 (R). Moreover, if n 2 (ω) − 1 ∈ L 2 (R), then it is not necessarily true that n(ω) − 1 ∈ L 2 (R). In order to draw any causality conclusions from the relations (185) and (186) formulated for n(ω), one should know that n(ω) − 1 belongs to L 2 (R). Despite these issues, the K-K relation of the type (185) is successfully used for the interpretation of experimental results in [75]. That is, the effect of a femtosecond-laser-induced electronic band-gap shift on the refractive index is explicitly studied with the use of the K-K relations. Clearly, from this relation, a change in the absorption described by n(ω) curve in turn affects n(ω). It is worth mentioning that the K-K relations (185) and (186) formulated for n(ω) are valid for a single mode propagation in waveguides (e.g., optical), for which one can write n(ω) = k/k 0 where k 0 = ω/c.
Let us now assume that n(ω) − 1 = n 2 (ω) − 1 is causal (e.g., it is an L 2 (R) function which satisfies (185) and (186)). Then, if N(t) = F −1 (n − 1) is such a distribution that the convolution N * N exists, and if it is a tempered distribution (the convolution of two causal distributions is causal as well), then (n(ω) − 1) 2 = F (N) · F (N) = F (N * N) is also causal. However, the opposite theorem is false in general. Nevertheless, if one knows that the function n 2 (s) is holomorphic in C + and it does not achieve 0, then n(s) is holomorphic as well.
Still, one can prove that the passivity of media implies that n(ω) = n 2 (ω) is holomorphic in the right half-plane [76]. The problem of choice between n(ω) and n 2 (ω) for applications in optics is debated in [77]. That is, the consideration of propagation of optical pulses with the use of complex index of refraction is inconvenient in general. Therefore, when calculating the wave vector, one can take either n 2 (ω) or n 2 (ω) as a velocity of wave propagation. Whereas the difference between the two is negligible for small losses, it is significant in other cases. The analysis of pulse propagation demonstrates that the use of n 2 (ω) results in a wave vector different than that actually exhibited by the propagating pulse. On the other hand, the definition n 2 (ω) always correctly calculates the wave vector of pulse, hence it is preferred in optical investigations. Moreover, for negative refraction media, when the sign of permittivity or permeability changes as a frequency function, one should notice that the derivative of n(ω) does not exist when n(ω) approaches zero. One should take all these issues into consideration when either the K-K relations (183) and (184) or (185) and (186) are applied.
It is worth noticing that the K-K relations are also applicable-to a certain extent-in nonlinear optics. That is, the response function in the time domain should also be causal for nonlinear media. In general, nonlinear complex susceptibilities can have poles not only on a half of the complex-frequency plane; however, there are cases when it happens. In such a case, when holomorphic properties are available on a half of the complex-frequency plane, the standard K-K relations (83) and (84) can be useful [78]. The review of nonlinear K-K relations in optics and photonics is presented in [79], where it is shown that the nonlinear dispersion relations have a common form that can be understood in terms of the linear K-K relations (83) and (84) applied to a new electromagnetic system consisting of the material and the perturbation of its parameters. As noticed in [80], the nonlinear K-K relations are useful in optics and photonics to predict that an enhancement in the nonlinear optical absorption for a specific wavelength usually leads to a decrease in the nonlinear optical refraction associated with a considered material.
It is derived assuming that, at and near the observation frequency ω, the material is transparent (e.g., the losses are compensated by gain), which mathematically implies that [n 2 (ω)] = 0 and ∂ [n 2 (ω)]/∂ω = 0. Furthermore, it is assumed that the negative refractive index is implied by the condition v p · v g < 0, where v p and v g denote, respectively, the phase and group velocity. Due to these limitations, the condition (187) is replaced in [28] by the condition which does not require that [n 2 (ω)] = 0 and ∂ [n 2 (ω)]/∂ω = 0 at the observation frequency. The condition (187), obtained from the K-K relations, implies that compensation of the optical losses or significant reduction, by any means (material or structural) of the imaginary part of permittivity and permeability, can also change the real parts of these quantities in such a way that the negative refraction disappears [27]. Concerning (183) and (187), care should be exercised for the case when both the real part of permittivity (ω) and the real part of permeability µ (ω) are negative-simultaneously and within the same frequency region, because in that case, although the product (ω)µ (ω) is positive, one should nonetheless select the root with negative sign of the real part of n 2 (ω) and ensure that, in the absence of a gain mechanism the medium does remain passive [81]. Further, one can notice that (187) stipulates that it is impossible to have a loss-free or amplifying ( (ω) ≤ 0, µ (ω) ≤ 0) medium with a negative refractive index ( (ω) < 0, µ (ω) < 0, n (ω) = n(ω) < 0) for all the frequencies-or else (187) would not hold true. However, (187) does not preclude the possibility that such a medium exists within a finite-bandwidth frequency region, outside which the product in the numerator in the integral of (187) could make a sufficiently 'negative' contribution, so that the overall (187) still holds true. Indeed, such lossless negative-refractive-index media have been reported in the past, both experimentally [82] and numerically [83][84][85].

Summary
The described analytical methods for causality evaluation of dielectric models of photonic materials are summarised in Table 1. Table 1. Summary of causality evaluation methods for photonic materials (IFT-inverse Fourier transformation, KKR-K-K relations, HE-holomorfic extension, PWT-Paley-Wiener theorem).

Conclusions
In this article, a comprehensive analysis of mathematical techniques for causality evaluation of photonic materials is presented. It includes not only the approaches valid for the L 2 functions, i.e., those for which the Titchmarsh theorem can be useful, but also the functions to which the distribution theory and the FO calculus have to be applied. We present a set of theorems applicable for causality evaluations, as well as specific examples showing how to use this mathematical machinery. Furthermore, the set of various distributional theorems presented in literature is collected as the distributional version of the Titchmarsh theorem, allowing us to evaluate causality of complicated electromagnetic systems on a mathematically rigorous basis. In addition to the well-known K-K relations, we have also outlined four further methodologies, namely application of the Paley-Wiener theorem, calculation of the inverse Fourier transformation, identification of holomorphic extensions to the right half-plane, and check of the K-K relations for the natural logarithm of a system's frequency response. The collection of these methodologies-otherwise scattered in a wide range of pertinent literature-may prove useful for scientists and engineers investigating causality problems in electrodynamics and optics.